
test: cover bench format renderer#306

Merged
ndycode merged 1 commit into main from test/pr13-bench-format-render-smoke
Mar 23, 2026


@ndycode (Owner) commented Mar 22, 2026

Summary

  • add focused coverage for scripts/bench-format/render.mjs
  • verify both markdown report generation and HTML dashboard rendering from a minimal benchmark summary fixture
  • improve coverage for a previously untested benchmark renderer entrypoint module
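
The smoke check described above can be sketched roughly as follows. This is not the content of test/bench-format-render.test.ts: the `Renderers` type, the `smokeCheck` helper, and the stub renderers are hypothetical stand-ins so the snippet runs on its own. Only the asserted substrings are taken from the review flowchart in this PR.

```typescript
// Hypothetical shape of the two exports under test; the real render.mjs is untyped.
type Renderers = {
  buildMarkdownReport: (summary: unknown) => string;
  renderDashboardHtml: (summary: unknown) => string;
};

// Substrings the smoke tests look for, as listed in the review flowchart.
const MARKDOWN_MARKERS = [
  "# Code Edit Format Benchmark",
  "## Leaderboard (Accuracy First)",
  "GPT-5 Codex",
];
const HTML_MARKERS = [
  "<!doctype html>",
  "Code Edit Format Benchmark",
  "GPT-5 Codex",
  "deltaVsReplaceHashline",
];

// Returns the markers that are missing; an empty array means the check passed.
function smokeCheck(renderers: Renderers, summary: unknown): string[] {
  const markdown = renderers.buildMarkdownReport(summary);
  const html = renderers.renderDashboardHtml(summary);
  return [
    ...MARKDOWN_MARKERS.filter((m) => !markdown.includes(m)).map((m) => `markdown: ${m}`),
    ...HTML_MARKERS.filter((m) => !html.includes(m)).map((m) => `html: ${m}`),
  ];
}

// Stub renderers (assumption) standing in for scripts/bench-format/render.mjs.
const stubRenderers: Renderers = {
  buildMarkdownReport: () =>
    "# Code Edit Format Benchmark\n\n## Leaderboard (Accuracy First)\n\n| GPT-5 Codex |",
  renderDashboardHtml: () =>
    '<!doctype html><title>Code Edit Format Benchmark</title>' +
    '<script>{"name":"GPT-5 Codex","deltaVsReplaceHashline":0}</script>',
};

const missing = smokeCheck(stubRenderers, {});
console.log(missing.length === 0 ? "smoke check passed" : missing.join("\n"));
```

In a real test file the stubs would be replaced by the imported functions, and each marker would typically be its own `expect(...).toContain(...)` call so a failure points at the exact missing string.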

Validation

  • npm run typecheck
  • npm run lint -- test/bench-format-render.test.ts
  • npm run test -- test/bench-format-render.test.ts
  • manual QA: node --input-type=module -e "import { buildMarkdownReport, renderDashboardHtml } from './scripts/bench-format/render.mjs'; ..."

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

adds `test/bench-format-render.test.ts`: two smoke tests covering the previously untested `scripts/bench-format/render.mjs` entrypoint, verifying markdown leaderboard output and html dashboard rendering against a minimal fixture. the change is low-risk and improves coverage for a pure render utility with no i/o or token handling.

  • `as never` cast used on lines 34 and 41 to bypass typescript; this pattern is against the project's explicit anti-pattern policy for type escapes and should be replaced with a typed fixture or `as unknown as T`
  • `failures: []` in the fixture means the `## Failures` section branch in `buildMarkdownReport` is never exercised; an additional test case would bring coverage closer to the 80% threshold required by the project
  • no windows filesystem risk, no token handling, no concurrency concerns in this file — safe in those dimensions

Confidence Score: 4/5

  • safe to merge after fixing the `as never` type bypass and adding a failures-path test case
  • both issues are p2 style/coverage gaps with clear fixes; no runtime bugs, no security risk, no concurrency issues. the `as never` anti-pattern is the one concrete thing to resolve before merge per project conventions.
  • file needing attention: `test/bench-format-render.test.ts` (`as never` cast and missing failures branch coverage)

Important Files Changed

| Filename | Overview |
| --- | --- |
| `test/bench-format-render.test.ts` | adds two smoke tests for `buildMarkdownReport` and `renderDashboardHtml`; uses `as never` type bypass (project anti-pattern); failures branch in `buildMarkdownReport` is not exercised by the fixture |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[test fixture\nsummary object] --> B[buildMarkdownReport]
    A --> C[renderDashboardHtml]

    B --> D[buildLeaderboardRows\nsort by hashline_v2 accuracy]
    D --> E[markdown string\nleaderboard + timing tables]
    E --> F{assertions}
    F --> F1["toContain('# Code Edit Format Benchmark')"]
    F --> F2["toContain('## Leaderboard (Accuracy First)')"]
    F --> F3["toContain('GPT-5 Codex')"]

    C --> G[buildLeaderboardRows]
    G --> H[JSON.stringify payload\nembedded in script tag]
    H --> I[HTML string]
    I --> J{assertions}
    J --> J1["toContain('<!doctype html>')"]
    J --> J2["toContain('Code Edit Format Benchmark')"]
    J --> J3["toContain('GPT-5 Codex')"]
    J --> J4["toContain('deltaVsReplaceHashline')"]

    style B fill:#1e3a5f,color:#ecf2ff
    style C fill:#1e3a5f,color:#ecf2ff
    style A fill:#0f172b,color:#9cb0d9
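
The flowchart notes that `renderDashboardHtml` embeds a `JSON.stringify` payload in a script tag. A minimal sketch of that general technique follows, assuming nothing about how render.mjs actually does it: `embedPayload` and the `DashboardRow` shape are hypothetical (only the field name `deltaVsReplaceHashline` appears in this PR), and the `<` escaping is a common precaution rather than something confirmed from the source.

```typescript
// Hypothetical row shape for illustration only.
interface DashboardRow {
  displayName: string;
  deltaVsReplaceHashline: number;
}

// Serialize rows for inline use in a <script> tag. Escaping "<" as \u003c keeps a
// string such as "</script>" inside the data from closing the tag early, and
// JSON.parse restores the original character when the payload is read back.
function embedPayload(rows: DashboardRow[]): string {
  const json = JSON.stringify({ rows }).replace(/</g, "\\u003c");
  return `<!doctype html>
<html>
<head><title>Code Edit Format Benchmark</title></head>
<body>
<script id="payload" type="application/json">${json}</script>
</body>
</html>`;
}

const dashboardHtml = embedPayload([
  { displayName: "GPT-5 Codex", deltaVsReplaceHashline: 0 },
  { displayName: "</script><b>oops</b>", deltaVsReplaceHashline: -1.5 },
]);
console.log(dashboardHtml.includes("deltaVsReplaceHashline"));
```

A `toContain('deltaVsReplaceHashline')` assertion like the one in the flowchart only proves the key is serialized somewhere in the page; parsing the payload back out would be a stronger check if the dashboard ever needs one.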


Prompt To Fix All With AI
This is a comment left during a code review.
Path: test/bench-format-render.test.ts
Line: 34

Comment:
**`as never` bypasses type safety**

`as never` is an unusual cast used here (and on line 41) to silence TypeScript without providing a typed fixture. the project anti-pattern list explicitly forbids `as any`, `@ts-ignore`, and `@ts-expect-error` as type escapes. `as never` achieves the same bypass: asserting the value is `never` makes it assignable to any function parameter.

since `render.mjs` is untyped `.mjs`, the cleanest fix is to cast through `unknown`:

```suggestion
		const markdown = buildMarkdownReport(summary as unknown as Parameters<typeof buildMarkdownReport>[0]);
```

or define a minimal local `BenchmarkSummary` interface and type `summary` against it — either approach avoids the implicit-any pattern the codebase bans.
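
To illustrate why `as never` acts as a type escape, and what a typed fixture buys instead, here is a minimal sketch. `BenchmarkSummary`, its fields, and `renderTitle` are hypothetical and are not the real shape expected by render.mjs.

```typescript
// Hypothetical minimal summary shape; the real fixture likely has more fields.
interface BenchmarkSummary {
  title: string;
  failures: { taskId: string; reason: string }[];
}

// Stand-in for a typed render function.
function renderTitle(summary: BenchmarkSummary): string {
  return `# ${summary.title} (${summary.failures.length} failures)`;
}

// `never` is assignable to every type, so this call compiles although the
// object has the wrong shape. The mistake only shows up at runtime.
const wrongShape = { heading: "Code Edit Format Benchmark" };
let escapedError = "";
try {
  renderTitle(wrongShape as never);
} catch (error) {
  escapedError = String(error); // reading `.length` of the missing `failures` throws
}

// A typed fixture makes the compiler check the shape: removing `failures`
// or misspelling `title` here would be a compile-time error.
const typedFixture: BenchmarkSummary = {
  title: "Code Edit Format Benchmark",
  failures: [],
};
const rendered = renderTitle(typedFixture);
console.log(rendered);
console.log(escapedError !== "");
```

The `as unknown as Parameters<...>[0]` form suggested above is still an unchecked assertion, but it states the target type explicitly, whereas a typed fixture lets the compiler verify the object itself.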


---

This is a comment left during a code review.
Path: test/bench-format-render.test.ts
Line: 33-47

Comment:
**failures branch not covered**

`buildMarkdownReport` has a distinct code path when `summary.failures.length > 0` (renders `## Failures` section, also has a separate truncation branch at >20 entries). the fixture hardcodes `failures: []` so neither branch is exercised.

given the project's 80% coverage threshold, consider adding one more `it` to cover the failures path:

```ts
it("includes failures section when failures present", () => {
  const withFailures = {
    ...summary,
    failures: [
      { modelId: "gpt-5-codex", mode: "patch", taskId: "task-1", phase: "run", reason: "timeout" },
    ],
  };
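  // cast through unknown rather than `as never`, in line with the first review comment
  const markdown = buildMarkdownReport(withFailures as unknown as Parameters<typeof buildMarkdownReport>[0]);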
  expect(markdown).toContain("## Failures");
  expect(markdown).toContain("timeout");
});
```

this also exercises the graceful `undefined`-mode handling inside the leaderboard builder, which is only reachable with real failure-tagged rows.
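
The three cases described in this comment (no failures, a `## Failures` section, and truncation past 20 entries) could be covered along these lines. `buildFailuresSection` is a hypothetical stand-in, not the code in render.mjs: the row format and the "... and N more" wording are assumptions, and only the limit of 20 and the failure fields come from the review comment.

```typescript
// Failure fields as used in the review's suggested test case.
interface Failure {
  modelId: string;
  mode: string;
  taskId: string;
  phase: string;
  reason: string;
}

const MAX_LISTED_FAILURES = 20; // limit mentioned in the review comment

// Hypothetical stand-in for the failures part of a markdown report.
function buildFailuresSection(failures: Failure[]): string {
  if (failures.length === 0) return ""; // no section at all
  const lines = ["## Failures", ""];
  for (const f of failures.slice(0, MAX_LISTED_FAILURES)) {
    lines.push(`- ${f.modelId}/${f.mode}/${f.taskId} (${f.phase}): ${f.reason}`);
  }
  const hidden = failures.length - MAX_LISTED_FAILURES;
  if (hidden > 0) lines.push(`- ... and ${hidden} more`); // truncation branch
  return lines.join("\n");
}

// Build n distinct failures for a fixture.
function makeFailures(n: number): Failure[] {
  return Array.from({ length: n }, (_, i) => ({
    modelId: "gpt-5-codex",
    mode: "patch",
    taskId: `task-${i + 1}`,
    phase: "run",
    reason: "timeout",
  }));
}

console.log(buildFailuresSection(makeFailures(0)) === "");
console.log(buildFailuresSection(makeFailures(1)).includes("## Failures"));
console.log(buildFailuresSection(makeFailures(21)).includes("and 1 more"));
```

Fixtures of size 0, 1, and 21 hit each case once; a fixture of exactly 20 is a useful extra boundary case if the real implementation compares with `>` rather than `>=`.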


Last reviewed commit: "test: cover bench fo..."

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@coderabbitai (bot, Contributor) commented Mar 22, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • skip-review

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0c326b2a-3209-481a-beac-4652ff7604dd

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@ndycode ndycode added the passed label Mar 22, 2026
@ndycode ndycode merged commit 43c877f into main Mar 23, 2026
2 checks passed